Skip to content

Avoid panics in NgxListIterator, ngx_str_t::to_str #183

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jul 28, 2025

Conversation

pchickey
Copy link
Contributor

@pchickey pchickey commented Jul 25, 2025

Proposed changes

ngx::http::NgxListIterator: ngx_str_t items may not be str:

The ngx_str_items in the header name and value are often untrusted
input, and may not have utf-8 contents. The use of ngx_str_t::to_str
in this iterator will panic when the contents are not utf-8. So, instead
of yielding a pair of strs here, yield a pair of &NgxStr.

ngx_str_t::to_str should not unwrap

In general, we should avoid exposing methods that panic when its
reasonable to return a Result instead. This particular method was used,
likely without considering that it may panic or validating that the
contents are utf-8, in ngx::http::NgxListIterator's Iterator impl,
which made it very easy to panic based on untrusted input.

Checklist

Before creating a PR, run through this checklist and mark each as complete.

  • I have written my commit messages in the Conventional Commits format.
  • I have read the CONTRIBUTING doc
  • I have added tests (when possible) that prove my fix is effective or that my feature works
  • I have checked that all unit tests pass after adding my changes
  • I have updated necessary documentation
  • I have rebased my branch onto main
  • I will ensure my PR is targeting the main branch and pulling from my branch from my own fork

@pchickey pchickey force-pushed the pch/str_conversions branch from 67c967a to 479f08c Compare July 25, 2025 23:04
@pchickey
Copy link
Contributor Author

The NgxListIterator bug was discovered by @xeioex so I'll tag him for review

@xeioex
Copy link
Contributor

xeioex commented Jul 25, 2025

@pchickey

The ngx_str_items in the header name and value are often untrusted
input, and may not have utf-8 contents. The use of ngx_str_t::to_str
in this iterator will panic when the contents are not utf-8. So, instead
of yielding a pair of strs here, yield a pair of byte slices.

I believe we need more time to think for this Header iterator interface as there are a lot of things to consider.
For example I am looking into hyper:

Represents an HTTP header field value.

In practice, HTTP header field values are usually valid ASCII. However, the HTTP spec allows for a header value to contain opaque bytes as well. In this case, the header field value is not able to be represented as a string.

To handle this, the HeaderValue is useable as a type and can be compared with strings and implements Debug. A to_str fn is provided that returns an Err if the header value contains non visible ascii characters.

Also looking at RFC9110

  field-value    = *field-content
  field-content  = field-vchar
                   [ 1*( SP / HTAB / field-vchar ) field-vchar ]
  field-vchar    = VCHAR / obs-text
  obs-text       = %x80-FF
  
  
A field value does not include leading or trailing whitespace. When a specific version of HTTP allows such whitespace to appear in a message, a field parsing implementation MUST exclude such whitespace prior to evaluating the field value.

Field values are usually constrained to the range of US-ASCII characters [[USASCII](https://www.rfc-editor.org/rfc/rfc9110.html#USASCII)]. Fields needing a greater range of characters can use an encoding, such as the one defined in [[RFC8187](https://www.rfc-editor.org/rfc/rfc9110.html#RFC8187)]. Historically, HTTP allowed field content with text in the ISO-8859-1 charset [[ISO-8859-1](https://www.rfc-editor.org/rfc/rfc9110.html#ISO-8859-1)], supporting other charsets only through use of [[RFC2047](https://www.rfc-editor.org/rfc/rfc9110.html#RFC2047)] encoding. Specifications for newly defined fields SHOULD limit their values to visible US-ASCII octets (VCHAR), SP, and HTAB. A recipient SHOULD treat other allowed octets in field content (i.e., [obs-text](https://www.rfc-editor.org/rfc/rfc9110.html#fields.values)) as opaque data.

Field values containing CR, LF, or NUL characters are invalid and dangerous, due to the varying ways that implementations might parse and interpret those characters; a recipient of CR, LF, or NUL within a field value MUST either reject the message or replace each of those characters with SP before further processing or forwarding of that message. Field values containing other CTL characters are also invalid; however, recipients MAY retain such characters for the sake of robustness when they appear within a safe context (e.g., an application-specific quoted string that will not be processed by any downstream HTTP parser).

Treating the headers as bytes simplifies things for us in ngx-rust, but I am not sure it is very convenient for users.

@pchickey pchickey force-pushed the pch/str_conversions branch from 479f08c to 0afa1af Compare July 28, 2025 17:46
@pchickey
Copy link
Contributor Author

Rewritten according to feedback

@pchickey pchickey requested review from xeioex and bavshin-f5 July 28, 2025 17:47
Copy link
Member

@bavshin-f5 bavshin-f5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just one small nit.

@pchickey pchickey force-pushed the pch/str_conversions branch from 0afa1af to 8c30f7c Compare July 28, 2025 21:04
@pchickey pchickey requested a review from bavshin-f5 July 28, 2025 21:04
@pchickey pchickey changed the title Avoid panics in NgxListIterator, rename to_str to as_str where appropriate Avoid panics in NgxListIterator, to_str Jul 28, 2025
@pchickey pchickey changed the title Avoid panics in NgxListIterator, to_str Avoid panics in NgxListIterator, ngx_str_t::to_str Jul 28, 2025
Copy link
Contributor

@xeioex xeioex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ngx::http::NgxListIterator: ngx_str_t items are NgxStr, not str

to better adhere to commit log notation in the repo I suggest to replace it to something like

fix: use NgxStr in NgxListIterator to prevent UTF-8 panics

and
ngx_str_t::to_str should not unwrap.

to

fix(sys): replace panicking ngx_str_t::to_str with fallible variant

otherwise looks good

pchickey added 2 commits July 28, 2025 14:33
ngx::http::NgxListIterator ngx_str_t items are now NgxStr, not str.

The ngx_str_items in the header name and value are often untrusted
input, and may not have utf-8 contents. The use of ngx_str_t::to_str
in this iterator will panic when the contents are not utf-8. So, instead
of yielding a pair of strs here, yield a pair of &NgxStr, which is like
ngx_str_t but with more methods.
ngx_str_t::to_str should not contain an unwrap.

In general, we should avoid exposing methods that panic when its
reasonable to return a Result instead. This particular method was used,
likely without considering that it may panic or validating that the
contents are utf-8, in ngx::http::NgxListIterator's Iterator impl,
which made it very easy to panic based on untrusted input.
@pchickey pchickey force-pushed the pch/str_conversions branch from 8c30f7c to 11bdc20 Compare July 28, 2025 21:34
@pchickey pchickey requested a review from xeioex July 28, 2025 21:34
@pchickey
Copy link
Contributor Author

Commit titles rewritten

@bavshin-f5 bavshin-f5 merged commit 5890854 into nginx:main Jul 28, 2025
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants